知识图(kgs)已证明对于个人助理,提问系统和搜索引擎等应用非常重要。因此,确保其高质量至关重要。但是,公斤不可避免地包含错误,重复和缺失的价值,这可能会阻碍其在业务应用中的收养和实用性,因为它们没有策划,例如,低质量的kgs产生了在其顶部建立的低质量应用程序。在本视觉论文中,我们提出了一个实用的知识图策划框架,以提高KG的质量。首先,我们定义了一组用于评估KGS状态的质量指标,其次,我们将KGS的验证和验证描述为清洁任务,第三,我们提出了重复的检测和知识融合策略,以丰富KGS。此外,我们为策划KGS的更好的建筑提供了见解和方向。
translated by 谷歌翻译
知识图的完整性是重要的质量维度,也是对使用IT的应用程序表现良好的因素。通过执行知识丰富,可以改善完整性。重复检测旨在在知识图的实例之间找到身份联系,并且是知识丰富的基本子任务。当前解决问题的解决方案需要对工具的专家知识及其应用的知识图。用户可能没有这种专家知识。我们介绍了基于服务的重复检测任务的方法,该方法提供了一种易于使用的无代码解决方案,该解决方案仍然与最先进的解决方案竞争,并且最近在工业背景下被采用。评估将基于几种常用的测试方案。
translated by 谷歌翻译
We introduce MegaPose, a method to estimate the 6D pose of novel objects, that is, objects unseen during training. At inference time, the method only assumes knowledge of (i) a region of interest displaying the object in the image and (ii) a CAD model of the observed object. The contributions of this work are threefold. First, we present a 6D pose refiner based on a render&compare strategy which can be applied to novel objects. The shape and coordinate system of the novel object are provided as inputs to the network by rendering multiple synthetic views of the object's CAD model. Second, we introduce a novel approach for coarse pose estimation which leverages a network trained to classify whether the pose error between a synthetic rendering and an observed image of the same object can be corrected by the refiner. Third, we introduce a large-scale synthetic dataset of photorealistic images of thousands of objects with diverse visual and shape properties and show that this diversity is crucial to obtain good generalization performance on novel objects. We train our approach on this large synthetic dataset and apply it without retraining to hundreds of novel objects in real images from several pose estimation benchmarks. Our approach achieves state-of-the-art performance on the ModelNet and YCB-Video datasets. An extensive evaluation on the 7 core datasets of the BOP challenge demonstrates that our approach achieves performance competitive with existing approaches that require access to the target objects during training. Code, dataset and trained models are available on the project page: https://megapose6d.github.io/.
translated by 谷歌翻译
We present Polite Teacher, a simple yet effective method for the task of semi-supervised instance segmentation. The proposed architecture relies on the Teacher-Student mutual learning framework. To filter out noisy pseudo-labels, we use confidence thresholding for bounding boxes and mask scoring for masks. The approach has been tested with CenterMask, a single-stage anchor-free detector. Tested on the COCO 2017 val dataset, our architecture significantly (approx. +8 pp. in mask AP) outperforms the baseline at different supervision regimes. To the best of our knowledge, this is one of the first works tackling the problem of semi-supervised instance segmentation and the first one devoted to an anchor-free detector.
translated by 谷歌翻译
We propose KGTN-ens, a framework extending the recent Knowledge Graph Transfer Network (KGTN) in order to incorporate multiple knowledge graph embeddings at a small cost. We evaluate it with different combinations of embeddings in a few-shot image classification task. We also construct a new knowledge source - Wikidata embeddings - and evaluate it with KGTN and KGTN-ens. Our approach outperforms KGTN in terms of the top-5 accuracy on the ImageNet-FS dataset for the majority of tested settings.
translated by 谷歌翻译
We present a unified and compact representation for object rendering, 3D reconstruction, and grasp pose prediction that can be inferred from a single image within a few seconds. We achieve this by leveraging recent advances in the Neural Radiance Field (NeRF) literature that learn category-level priors and fine-tune on novel objects with minimal data and time. Our insight is that we can learn a compact shape representation and extract meaningful additional information from it, such as grasping poses. We believe this to be the first work to retrieve grasping poses directly from a NeRF-based representation using a single viewpoint (RGB-only), rather than going through a secondary network and/or representation. When compared to prior art, our method is two to three orders of magnitude smaller while achieving comparable performance at view reconstruction and grasping. Accompanying our method, we also propose a new dataset of rendered shoes for training a sim-2-real NeRF method with grasping poses for different widths of grippers.
translated by 谷歌翻译
To apply federated learning to drug discovery we developed a novel platform in the context of European Innovative Medicines Initiative (IMI) project MELLODDY (grant n{\deg}831472), which was comprised of 10 pharmaceutical companies, academic research labs, large industrial companies and startups. The MELLODDY platform was the first industry-scale platform to enable the creation of a global federated model for drug discovery without sharing the confidential data sets of the individual partners. The federated model was trained on the platform by aggregating the gradients of all contributing partners in a cryptographic, secure way following each training iteration. The platform was deployed on an Amazon Web Services (AWS) multi-account architecture running Kubernetes clusters in private subnets. Organisationally, the roles of the different partners were codified as different rights and permissions on the platform and administrated in a decentralized way. The MELLODDY platform generated new scientific discoveries which are described in a companion paper.
translated by 谷歌翻译
在现实世界中,教授多指的灵巧机器人在现实世界中掌握物体,这是一个充满挑战的问题,由于其高维状态和动作空间。我们提出了一个机器人学习系统,该系统可以进行少量的人类示范,并学会掌握在某些被遮挡的观察结果的情况下掌握看不见的物体姿势。我们的系统利用了一个小型运动捕获数据集,并为多指的机器人抓手生成具有多种多样且成功的轨迹的大型数据集。通过添加域随机化,我们表明我们的数据集提供了可以将其转移到策略学习者的强大抓地力轨迹。我们训练一种灵活的抓紧策略,该策略将对象的点云作为输入,并预测连续的动作以从不同初始机器人状态掌握对象。我们在模拟中评估了系统对22多伏的浮动手的有效性,并在现实世界中带有kuka手臂的23多杆Allegro机器人手。从我们的数据集中汲取的政策可以很好地概括在模拟和现实世界中的看不见的对象姿势
translated by 谷歌翻译
任务计划可能需要定义有关机器人需要采取行动的世界的无数领域知识。为了改善这项工作,可以使用大型语言模型(LLM)在任务计划期间为潜在的下一个操作评分,甚至直接生成动作序列,鉴于没有其他域信息的自然语言指令。但是,这样的方法要么需要列举所有可能的下一步评分,要么生成可能包含在当前机器人中给定机器人上不可能操作的自由形式文本。我们提出了一个程序化的LLM提示结构,该结构能够跨越位置环境,机器人功能和任务的计划生成功能。我们的关键见解是提示LLM具有环境中可用操作和对象的类似程序的规格,以及可以执行的示例程序。我们通过消融实验提出了有关迅速结构和生成约束的具体建议,证明了虚拟屋家庭任务中最先进的成功率,并将我们的方法部署在桌面任务的物理机器人组上。网站progprompt.github.io
translated by 谷歌翻译
变形金刚用大型数据集的扩展能力彻底改变了视力和自然语言处理。但是在机器人的操作中,数据既有限又昂贵。我们仍然可以从具有正确的问题制定的变压器中受益吗?我们用Peract进行了调查,这是一种用于多任务6 DOF操纵的语言条件的行为结合剂。 Peract用感知器变压器编码语言目标和RGB-D Voxel观测值,并通过“检测下一个最佳素素动作”来输出离散的动作。与在2D图像上运行的框架不同,体素化的观察和动作空间为有效学习的6-DOF策略提供了强大的结构性先验。通过此公式,我们训练一个单个多任务变压器,用于18个RLBench任务(具有249个变体)和7个现实世界任务(具有18个变体),从每个任务仅几个演示。我们的结果表明,针对各种桌面任务,佩内的磨损明显优于非结构化图像到作用剂和3D Convnet基准。
translated by 谷歌翻译